ABSTRACT

The effects of varying degrees of correlation between 
abilities and of various correlation configurations between item 
parameters on ability and item parameter estimation using the three 
parameter logistic model were examined. Ten two-trait configurations 
and one unidimensional test configuration for 30 item tests were 
simulated. Each configuration consisted of a specific item parameter 
configuration and a specific correlation between traits on two 
dimensions. Six conditions were simulated for each configuration, 
ranging from an easy to a hard test. The accuracy of item and ability 
parameter estimation was examined using correlations; KR-20 
coefficients and factor analyses were also performed. The factor 
analyses supported a division of the simulated multidimensional data 
sets into groups according to how the discrimination parameter "loads" on 
the two dimensions. The tests either both load heavily on both 
dimensions (both tests are multidimensional), one test loads heavily 
on one dimension and the other loads heavily on the same dimension 
(both tests are unidimensional), one test is unidimensional and one 
is multidimensional, or one test loads heavily on one dimension and 
the other test loads heavily on the other dimension. Results 
indicated the poorest item parameter estimations occur for the 
situation in which one test is unidimensional and one is 
multidimensional. (Author/DWH) 

ABSTRACT

The effects of varying degrees of correlation between
abilities and of various correlation configurations between
item parameters on ability and item parameter estimation
using the three-parameter logistic model were examined. Ten
two-trait and one unidimensional test configurations for
thirty-item tests were simulated for 6000 simulees. Each
configuration consists of a specific item parameter
configuration and a specific correlation between traits on
two dimensions. Six conditions were simulated for each
configuration, ranging from a very easy to a very hard
test. The accuracy of item and ability parameter
estimation was examined using correlations; KR-20
coefficients and factor analyses were also performed. The
factor analyses supported a division of the simulated
multidimensional data sets into groups according to how the
discrimination parameter "loads" on the two dimensions. The
tests either both load heavily on both dimensions (both
tests are multidimensional), one test loads heavily on one
dimension and the other loads heavily on the same dimension
(both tests are unidimensional), one test is unidimensional
and one is multidimensional, or one test loads heavily on
one dimension and the other test loads heavily on the other
dimension. The results indicate that the poorest item
parameter estimations occur for the situation in which one
test is unidimensional and one is multidimensional.

EXAMINING THE EFFECTS OF MULTIDIMENSIONAL DATA ON
ABILITY AND ITEM PARAMETER ESTIMATION USING
THE THREE-PARAMETER LOGISTIC MODEL



INTRODUCTION 



Several multidimensional models have been proposed and
some research has been conducted using these models
(Doody-Bogan & Yen, 1983; Hattie, 1982; McKinley, 1983;
McKinley & Reckase, 1982, 1983a, 1983b, 1984a; Reckase,
1979; Reckase & McKinley, 1982). However, use of these
models has not yet proven feasible.

Most of the item response theory (IRT) methodology
that has been developed is applicable only to the limited
case of one-dimensional data, in which case the assumption
of unidimensionality is required in order to estimate the
item and ability parameters. Unfortunately, since in most
practical applications that assumption is not realistic,
and useful multidimensional estimation procedures are not
yet available, practitioners must either fall back on
traditional methodology or inappropriately apply IRT
methodology while hoping for robustness to violation of the
unidimensionality assumption. Such robustness remains
undemonstrated.

Violation of the unidimensionality assumption has been
suggested as a problem in estimation of item parameters
(Loyd & Hoover, 1980; Cook & Eignor, 1981). It is
informative to examine the effect of violation of the
unidimensionality assumption on the estimation of the item
parameters, a, b, and c, and on the estimation of ability.
This effect can then be considered when parameters are
estimated in situations in which multidimensionality is
known or suspected and no usable estimation procedure for
multidimensional data can be found.

Test Analysis Model

The statistical model to be used in this study for
analyzing item responses is the three-parameter logistic
model. This model assumes that an individual's performance
on a test is influenced by only one important unobservable
characteristic, $\theta$, which is called a (latent) trait or
ability. The three-parameter logistic model assumes that
the probability of a correct response to item i by person j
with ability level $\theta_j$ is:

$$P_i(\theta_j) = c_i + \frac{1 - c_i}{1 + \exp[-1.7 a_i (\theta_j - b_i)]} \qquad (1)$$

where $a_i$, $b_i$, and $c_i$ are the discriminating power,
difficulty, and lower asymptote or guessing parameter of
item i, respectively.
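
As an illustration of the three-parameter logistic model just defined, the following minimal sketch evaluates the response probability for a single item. The function name and the example item values (a = 1.0, b = 0.0, c = 0.2) are hypothetical and are not taken from the study.

```python
import math


def p_correct_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under the three-parameter
    logistic model: c + (1 - c) / (1 + exp(-D * a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical item used only for illustration.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct_3pl(theta, a=1.0, b=0.0, c=0.2), 3))
```

When ability equals the item difficulty, the probability is halfway between the lower asymptote c and 1; for very low ability it approaches c.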

The accuracy of item parameter estimation is affected
by several things, including the accuracy of the estimation
program (McKinley & Reckase, 1980), the size of the
calibration sample (Hambleton, Swaminathan, Cook, Eignor, &
Gifford, 1978; Reckase, 1977), and the percent of test
variance accounted for by the first factor found when the
data are factor analyzed (Reckase, 1979). Violation of the
unidimensionality assumption has been suggested as a
problem in estimation of item parameters (Loyd & Hoover,
1980; Cook & Eignor, 1981).

Objectives

The purpose of this research is to investigate the
robustness of item and ability parameter estimation using
the three-parameter logistic model to violation of the
unidimensionality assumption, and to examine the effects of
specific multidimensional data configurations on parameter
estimation using the three-parameter logistic model.

Educational or Scientific Importance of the Study

Most commonly used IRT models assume unidimensionality.
However, this assumption is not strictly satisfied by item
pools in most practical situations (Lord, 1968). While the
assumption of unidimensionality is acceptable in the case
of aptitude tests, that assumption is unrealistic for many
tests, including most achievement tests (McKinley &
Reckase, 1982; Reckase, 1979, 1981; Sympson, 1978). Any
factor that influences an examinee's score on a test, other
than the one latent trait (ability) assumed for the
one-dimensional model, will violate the assumption of
unidimensionality. Guessing, speededness, fatigue,
cheating, random answering, or accidentally overlooking or
skipping an item are possible factors. The existence of
two or more cognitive traits is one such possible factor.
An achievement test in mathematics might require both
reading skill and mathematical reasoning. An achievement
test in science might require both reading and knowledge of
science facts. If so, the assumption of unidimensionality
does not appear to be met. Nevertheless, IRT methodology
has well-known advantages over traditional methodology and
is applied in situations where it may not be appropriate.
Hence it is informative to determine the effects of
multidimensionality on parameter estimation. It is equally
important to develop guidelines for educators and
researchers concerned with achievement testing who wish to
benefit from the advantages of IRT methodology.





METHOD

The simulations begin with the generation of
two-dimensional data sets from a multidimensional model
using prespecified parameters, proceed through an
investigation into the effects on parameter estimation of
various multidimensional conditions, and end with a
re-examination of the accuracy of the parameter estimation
through cross-validation.

Data Generation

The main question to be examined in this research is
how robust parameter estimation based on the
three-parameter logistic model is to violation of the
unidimensionality assumption underlying the estimation. The
unidimensionality assumption is violated whenever the
scores that are being equated are multidimensional in the
sense that an examinee's score on a test is the result of
more than one latent trait. The data can be the result of
more than one latent trait and can also vary in the degree
of correlation that exists between these traits. Since
infinitely many multidimensional data sets fulfilling these
requirements are possible, this research project is
necessarily limited to a few of the possibilities.

Number of dimensions and degrees of correlation. The
two-dimensional case was chosen for this research as
typical of published tests and as a starting point in
examining the robustness of parameter estimation to
violation of the unidimensionality assumption. Examining
all possibilities is beyond the scope of this research.

The choice of correlations was limited to that which
seemed possible for a published test, the Comprehensive
Tests of Basic Skills, Forms U and V (CTBS/U, CTBS/V;
CTB/McGraw-Hill, 1981). A correlation of zero was chosen
to simulate data sets on which Trait 1 and Trait 2 have no
correlation. Correlations of .3 and .4 were chosen to
simulate data sets on which Trait 1 and Trait 2 have low
correlation, correlations of .5, .6, and .75 were used to
simulate two traits that are more highly correlated, and .9
was used to simulate two traits that are highly correlated.
One unidimensional data set, which is representative of a
situation in which the correlation between traits is 1.0,
was also generated to be used as a criterion against which
the analyses of the multidimensional data sets could be
compared.
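
One way to obtain two traits with a prespecified correlation is sketched below. The source does not state how abilities were drawn; the standard normal distribution, the function names, and the seed are assumptions made only for this illustration.

```python
import math
import random


def draw_abilities(n, rho, seed=0):
    """Draw n (theta1, theta2) pairs whose population correlation is rho.
    Assumes standard normal traits (an assumption, not from the source)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        theta1 = z1
        # Mix an independent deviate into theta2 so that corr = rho.
        theta2 = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        pairs.append((theta1, theta2))
    return pairs


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


pairs = draw_abilities(6000, rho=0.5)
print(round(pearson_r([p[0] for p in pairs], [p[1] for p in pairs]), 2))
```

With rho set to 1.0 the two traits coincide, which corresponds to the unidimensional criterion data set described above.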

Multivariate model. Two-dimensional data sets were
generated using the multidimensional model described by
Doody-Bogan and Yen (1983). This model is an extension of
the three-parameter logistic latent-trait model. The
multivariate logistic model is:

$$P_i(\boldsymbol{\theta}_j) = c_i + \frac{1 - c_i}{1 + \exp\left[-1.7 \sum_{t=1}^{m} a_{it} (\theta_{jt} - b_{it})\right]} \qquad (2)$$

where $P_i(\boldsymbol{\theta}_j) = P_i(\theta_{j1}, \ldots, \theta_{jm})$ is the probability of a
correct response to item i by a person j whose location in
an m-dimensional latent space is described by abilities
$\theta_{j1}, \theta_{j2}, \ldots, \theta_{jm}$; $\theta_{jt}$ represents the ability of person j on
trait t, $a_{it}$ is the discrimination of item i with respect
to latent trait t, $b_{it}$ is the difficulty of item i with
respect to latent trait t, and $c_i$ is the guessing parameter
for item i. Note that when m = 1, this model reduces to the
univariate logistic three-parameter model of Birnbaum
(1968). The model was used with m = 1 to simulate the
unidimensional data. Thirty-item tests were used.
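
The multivariate model and its use for generating item responses can be sketched as follows. This is a minimal illustration rather than the study's generation program: the function names, the uniform-draw scoring rule, and the example parameter values are assumptions.

```python
import math
import random


def p_correct_m3pl(thetas, a, b, c, D=1.7):
    """Probability of a correct response under the multivariate logistic
    model; thetas, a, and b each hold one value per latent trait."""
    z = sum(a_t * (th_t - b_t) for a_t, th_t, b_t in zip(a, thetas, b))
    return c + (1.0 - c) / (1.0 + math.exp(-D * z))


def simulate_responses(abilities, items, seed=0):
    """Return a 0/1 response matrix (persons by items). Each item is a
    tuple (a, b, c) with a and b given per trait. A response is scored
    correct when a uniform draw falls below the model probability."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p_correct_m3pl(th, a, b, c) else 0
         for (a, b, c) in items]
        for th in abilities
    ]


# Hypothetical two-trait example: the second item has zero
# discrimination on Trait 2, so only Trait 1 affects it.
items = [((1.0, 0.8), (0.0, 0.5), 0.2), ((1.2, 0.0), (-0.5, 0.0), 0.2)]
abilities = [(-1.0, 0.0), (0.0, 0.0), (1.0, 1.0)]
print(simulate_responses(abilities, items))
```

Passing single-element tuples for thetas, a, and b gives the m = 1 case, which matches the univariate three-parameter model.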

Item parameter values and item pool. Discrimination,
difficulty, and guessing values were chosen as in
Doody-Bogan and Yen (1983). The base pool consists of 30 items.
Since the existence of two traits is assumed, two
discrimination parameters, two difficulty parameters, and one
guessing parameter per item are required. Two test levels
were simulated, a harder test (Test 2) and an easier test
(Test 1). Item parameters for the harder test were
estimated using simulees with higher ability levels than those
used to estimate item parameters for the easier test.

Six data sets were simulated per data configuration,
ranging from a very easy test to a very hard test. For
different levels of difficulty, 1.0 was added to the base
b values to simulate the hardest test and 1.0 was
subtracted from the base b values to simulate the easiest
test. Similarly, .5 was subtracted from and added to the
base values to simulate slightly different levels of
difficulty. These differences in difficulty are represented
as $\bar{b}_2 - \bar{b}_1 = 0.0$, $1.0$, and $2.0$, where $\bar{b}_2$ is the mean
difficulty of the harder test (Test 2) and $\bar{b}_1$ is the mean
difficulty of the easier test (Test 1).

For each test configuration, item parameters $a_{i1}$, $b_{i1}$,
$a_{i2}$, $b_{i2}$, and $c_i$ were randomly assigned to Traits 1 and 2
for both Tests 1 and 2, with the restriction that the
desired correlations between parameters were approximated
as closely as possible. Randomizations were tried until
the desired correlations were obtained.
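
The repeated-randomization step can be sketched as a shuffle-and-check loop. The source does not give the tolerance, the number of attempts, or the exact procedure, so the tolerance of .05, the attempt limit, and all names below are assumptions for illustration only.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def shuffle_to_target(xs, ys, target, tol=0.05, max_tries=100_000, seed=0):
    """Reassign the ys values to items at random until corr(xs, ys) is
    within tol of target; return the closest correlation and ordering."""
    rng = random.Random(seed)
    candidate = list(ys)
    best_r, best_perm = pearson_r(xs, candidate), list(candidate)
    for _ in range(max_tries):
        if abs(best_r - target) <= tol:
            break
        rng.shuffle(candidate)
        r = pearson_r(xs, candidate)
        if abs(r - target) < abs(best_r - target):
            best_r, best_perm = r, list(candidate)
    return best_r, best_perm


# Hypothetical 30-item pool of discrimination and difficulty values.
a_values = [0.5 + 0.05 * i for i in range(30)]
b_values = [-1.45 + 0.1 * i for i in range(30)]
r, b_assigned = shuffle_to_target(a_values, b_values, target=-0.3)
print(round(r, 2))
```

Only the assignment of values to items changes, so the marginal distribution of each parameter stays the same across configurations.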

Test configurations. Ten two-trait test
configurations were chosen to be simulated. Each was
chosen as being typical of a possible achievement test.
Table 1 shows the desired trait and item parameter
correlations for the simulated data sets. Table 2 shows
possible tests where such correlations might exist. Table 3
shows the item parameters used for each data configuration.

Table 1

               Lower Level                         Upper Level
Configuration  Trait 1           Trait 2           Trait 1           Trait 2

1              r(θ1,θ2) = 0                        same as lower level
               r(a1,b1) = 0      r(a2,b2) = 0
               r(a1,a2) = 0      r(b1,b2) = 0

2              r(θ1,θ2) = .3                       r(θ1,θ2) = .3
               r(a1,b1) = -.3    r(a2,b2) = 0      r(a1,b1) = -.3    r(a2,b2) = 0
               r(a1,a2) = 0      r(b1,b2) = 0      r(a1,a2) = .2     r(b1,b2) = 0

3              r(θ1,θ2) = .4                       same as lower level
               r(a1,b1) = .2     r(a2,b2) = .5
               r(a1,a2) = 0      r(b1,b2) = 0

4              r(θ1,θ2) = .5                       r(θ1,θ2) = .5
               r(a1,b1) = .4     r(a2,b2) = 0      r(a1,b1) = .4     r(a2,b2) = .7
               r(a1,a2) = 0      r(b1,b2) = 0      r(a1,a2) = 0      r(b1,b2) = .5

5              r(θ1,θ2) = .5                       same as lower level
               r(a1,b1) = 0      r(a2,b2) = 0
               r(a1,a2) = -.8    r(b1,b2) = 0

6              r(θ1,θ2) = .5                       same as lower level
               r(a1,b1) = -.3    r(a2,b2) = 0
               r(a1,a2) = -.5    r(b1,b2) = 0

7              r(θ1,θ2) = .6                       same as lower level
               r(a1,b1) = 0      r(a2,b2) = 0
               r(a1,a2) = 0      r(b1,b2) = 0

8              r(θ1,θ2) = .75                      same as lower level
               r(a1,b1) = .4     r(a2,b2) = 0
               r(a1,a2) = 0      r(b1,b2) = 0

9              r(θ1,θ2) = .9                       same as lower level
               r(a1,b1) = 0      r(a2,b2) = 0
               r(a1,a2) = 0      r(b1,b2) = 0

10             r(θ1,θ2) = .9                       same as lower level
               r(a1,b1) = 0      r(a2,b2) = 0
               r(a1,a2) = .8     r(b1,b2) = .8

11             unidimensional                      unidimensional

Note: All item parameters are written without the item subscript, i.
      All ability parameters are written without the person subscript, j.



Table 2

Applications to Real Data

                              Test 1                                  Test 2
    Test Name                 Trait 1             Trait 2             Trait 1             Trait 2

 1. Language Mechanics        end punctuation     middle punctuation  end punctuation     middle punctuation
 2. Mathematics Computation   other items         decimals*           other items         decimals
 3. Mathematics Concepts      other items         fractions &         other items         fractions &
    and Applications                              conversions                             conversions
 4. Mathematics Computation   other items         fractions*          other items         fractions
 5. Social Studies            reading             graphs              reading             graphs
 6. Mathematics Computation   other items         decimals            other items         decimals
 7. Science                   reading             science facts       reading             science facts
 8. Mathematics Concepts      mathematics         reading             mathematics         reading
    and Applications
 9. Language Mechanics        reading             punctuation         reading             punctuation
10. Reading Comprehension     reading             vocabulary          reading             vocabulary

* not measured at this level.

Table 3

Base Item Parameters

[Table 3 lists, for each configuration, the base item parameters of the
30 items: columns AL1, AL2, BL1, BL2, and CL for the Easy Test and
columns AH1, AH2, BH1, BH2, and CH for the Hard Test. The numeric
entries are not legible in this copy.]

Note: AL1 and BL1 are discrimination and difficulty for Trait 1 on the easy test.
      AL2 and BL2 are discrimination and difficulty for Trait 2 on the easy test.
      AH1 and BH1 are discrimination and difficulty for Trait 1 on the hard test.
      AH2 and BH2 are discrimination and difficulty for Trait 2 on the hard test.
      Base item parameters are before +0, +1, or +2 are added to the difficulty parameters.

For Configuration 1, both traits were assumed to be
measured by both tests (Test 1, the easier test or level,
and Test 2, the harder test or level). Zero correlation is
assumed between Trait 1 and Trait 2. All correlations for
difficulty and discrimination, both within and between
traits, are assumed to be zero. This configuration was
chosen as approximating a situation such as Language
Mechanics where Trait 1 is end punctuation (i.e., period,
question mark, exclamation mark), and Trait 2 is middle
punctuation (i.e., comma, colon, semicolon). End
punctuation is typically taught before middle punctuation.
It is assumed that no correlation exists between Trait 1
(end punctuation) and Trait 2 (middle punctuation) since
the ability to understand how to use periods, exclamation
points, and question marks appears to be independent of the
ability to understand commas, colons, and semicolons.
Theoretically, a student could understand and/or master
either trait without any knowledge of the other trait.

The concept of decimals is usually introduced after
other, more basic, concepts (number, addition, subtraction,
etc.) have been taught. Therefore, items measuring
knowledge of decimals usually do not occur in the lowest or
easiest levels of a series of tests designed to cover
kindergarten through high school. Configuration 2 was
chosen to represent such a situation where the second trait
(such as decimals) is measured only by a few items on the
harder test. For the harder test, these few items are
assumed to have equal difficulty on both traits. The
discrimination for these items is assumed to be medium on
Trait 1 (other items) and high on Trait 2 (decimals). All
other items include a range of discrimination values on
Trait 1 and 0 discrimination on Trait 2. A low correlation
between the two traits (knowledge of decimals and knowledge
of other items) is assumed.

Configuration 3 involves a situation in which Trait 2
is measured on the easier test by a few items, and on the
harder test by most or all of the items. For the easier
test the discrimination is assumed to be high for these
items on both traits, while all other items have a range of
discrimination on Trait 1 and 0 discrimination on Trait 2.
The harder test has a range of discrimination on both
traits. This situation might occur if knowledge of
fractions and fraction conversions to decimals was one
trait measured on a test (Trait 2 here) and all other items
were measuring Trait 1 (not fractions or conversions). For
the easier test, a few items (the fraction/fraction
conversion items) are assumed to have high discrimination
on both traits. It is assumed that these few items are
included in the test in order to measure knowledge of
fractions and fraction conversions and hence should be
highly discriminating on the trait that they are assumed to
measure. It is assumed that they have high discrimination
on Trait 1 (addition, for example) since conceivably, a
test of knowledge of fractions would measure not only
whether a student grasps the concept of fractions but also
whether or not the student can add fractions.

Configuration 4 was chosen as a situation in which the 
second trait is measured only by a few items on the harder 
level. The correlation between traits is assumed to he 
moderate. Difficulty and discrimination are assumed to be 
moderately correlated -for both tests on Trait 1 and highly 
correlated on Trait 2 on the harder test. The difficulty 
parameters are assumed to have a medium correlation on the 
harder test. This configuration was chosen to reflect a 
situation such as Mathematics Computation, where Trait 2 is 
ability with fractions (measured only by a few items on the 
harder test) and Trait 1 is ability with nonfraction items 
(measured at both levels). 

For Configuration 5, both traits are assumed to be 
measured by both tests as in Configuration 1, except that 
both traits are not assumed to be measured by all items. 
Some items are assumed to be measuring only Trait 1 while 
having zero discrimination on Trait 2. Some items are 
assumed to measure Trait 2 only, having zero discrimination 
on Trait 1. A few items are assumed to measure both 
traits. Also, a medium correlation is assumed between 
traits, and discrimination is assumed to be negatively 
correlated across traits. This configuration was chosen to 
reflect a situation, such as Social Studies, where the two 
traits might be reading ability and ability to understand 
graphs. A typical Social Studies test usually contains some 
items pertaining to graphs only. A few items may require 
both reading ability and an understanding of graphs. Other 
items require reading only. 

For Configuration 6, it is assumed that both traits are 
measured by both tests and the correlation between traits 
is .5. A few items are assumed to have low discrimination 
on Trait 1 for both tests, low discrimination on Trait 2 
for Test 1, and high discrimination on Trait 2 for Test 2. 
All other items are assumed to have a mixture of high, 
medium and low discrimination on Trait 1 and zero 
discrimination on Trait 2. This configuration was chosen 
as being similar to a situation such as Mathematics 
Computation, where Trait 2 is ability in computation 
problems involving decimals and Trait 1 is ability in 
computing all problems not involving decimals. A few items 
involving decimals are assumed to have low discrimination 
on Trait 1 (no decimals) for both tests, low discrimination 
on Trait 2 (the decimal trait) for the lower level, and 
high discrimination on Trait 2 for the harder test. The 25 
other items are assumed to not measure decimal ability so 
have zero discrimination on Trait 2. 

Configuration 7 is the same as Configuration 1 (both 
traits are measured by both tests, with randomly assigned 
a, b, and c) except that a moderate correlation is assumed 
to exist between Traits 1 and 2. This configuration might 
result from a situation such as a Science test where an 
examinee's item responses could be the result of two 
traits: reading ability and knowledge of science facts. 

Configuration 8 assumes that both traits are measured 
by both tests and that a high (.75) correlation exists 
between the two traits. Discrimination and difficulty are 
correlated .4 in Trait 1. A test that involves both 
mathematics and reading, such as Mathematics Concepts and 
Applications, might produce such a configuration. 

Configuration 9 is the same as Configuration 1 (both 
traits are measured by both tests) except that a high 
correlation (.9) is assumed to exist between Traits 1 and 
2. This configuration could represent a test such as 
Language Mechanics, where Trait 1 is reading ability and 
Trait 2 is punctuation ability. A high correlation between 
reading and punctuation is assumed since in order to 
understand the mechanics of language, both ability in 
reading and ability in punctuation must be present. 

In a test that might involve both reading and 
vocabulary as separate traits, such as Reading 
Comprehension, a high correlation between traits would 
probably exist. Such a situation is assumed for 
Configuration 10, with discrimination and difficulty highly 
correlated across traits. 

Configuration U is the unidimensional criterion. This 
configuration is simulated by setting m = 1 in the 
multidimensional model used to generate the data. 

Data conditions. Combining the above described 
configurations results in 33 data conditions: 11 (item 
parameter/correlation configurations) x 3 ($b_2 - b_1$) values. 

Simulated Data. The Simulated Data sets include three 
groups of simulees for each of the two traits: 2,000 of low 
ability, 2,000 of middle ability, and 2,000 of high 
ability. The 2,000 theta values for each of the three 
levels were generated using the IMSL multivariate normal 
random deviate generator, GGNSM (IMSL, 1979). In each case 
a normal distribution is assumed ($\bar{\theta} = -0.57$, 
SD = 1.0 for the low ability group; $\bar{\theta} = 0.0$, 
SD = 1.0 for the medium ability group; $\bar{\theta} = 0.57$, 
SD = 1.0 for the high ability group). These differences in 
means were chosen to be similar to the differences between 
ability levels in published tests (CTBS/U, Levels E & F, 
and H & J). For the 33 data conditions, response vectors 
were generated for each of the three groups of observations 
for tests of 30 items each. 
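
The ability-generation step can be sketched in Python with NumPy standing in for the IMSL generator. This is an illustrative sketch only: the function name `simulate_thetas`, the seeds, the choice of a trait correlation of .5, and the assumption that each group has the same mean on both traits are not taken from the report.

```python
import numpy as np

def simulate_thetas(mean, rho, n=2000, seed=0):
    """Draw n (theta1, theta2) pairs from a bivariate normal distribution.

    Assumptions (not from the report): both traits share the same group
    mean, both have unit variance, and rho is the trait correlation.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    return rng.multivariate_normal([mean, mean], cov, size=n)

# Three ability groups with the means stated in the text.
group_means = [("low", -0.57), ("medium", 0.0), ("high", 0.57)]
groups = {name: simulate_thetas(mu, rho=0.5, seed=i)
          for i, (name, mu) in enumerate(group_means)}
```

Each entry of `groups` is a 2,000 x 2 array of true thetas, one column per trait.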

Separate sets of data were generated for parameter 
estimation and for cross-validation. For parameter 
estimation, data were generated for each of the 33 
conditions described above. Thirty-three new sets were 
generated to be used for cross-validation purposes. 

Response vectors. Using the prespecified "true" 
parameters ($a_{i1}$, $a_{i2}$, $b_{i1}$, $b_{i2}$, $c_i$) and the 
multidimensional model, $P_i(\theta_{j1}, \theta_{j2})$ was computed 
for each observation. From these $P_i(\theta_{j1}, \theta_{j2})$ 
values, (0,1) responses, $u_{ijk}$, were generated for each 
item i, simulee j, and test k, where $u_{ijk}$ is 1 if a random 
number is less than or equal to $P_i(\theta_{j1}, \theta_{j2})$ and 
0 otherwise. The random number was generated from a uniform 
distribution using IMSL subroutine GGUBS (IMSL, 1979). 
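
The response-generation rule can be illustrated with a short Python sketch. The thresholding step (a response is 1 when a uniform random number does not exceed the model probability) follows the text; the particular compensatory form of the two-trait three-parameter logistic model, the scaling constant D = 1.7, and the function names are assumptions made for illustration.

```python
import numpy as np

def prob_correct(theta, a, b, c, D=1.7):
    """Probability of a correct response under an assumed compensatory model.

    theta: (n_simulees, n_traits); a, b: (n_items, n_traits); c: (n_items,).
    The logit is D * sum_h a_ih * (theta_jh - b_ih); D = 1.7 is an assumption.
    """
    z = D * ((theta[:, None, :] - b[None, :, :]) * a[None, :, :]).sum(axis=2)
    return c + (1.0 - c) / (1.0 + np.exp(-z))

def simulate_responses(theta, a, b, c, seed=0):
    """Score 1 when a uniform(0, 1) draw is <= the model probability, else 0."""
    rng = np.random.default_rng(seed)
    p = prob_correct(theta, a, b, c)
    u = rng.uniform(size=p.shape)
    return (u <= p).astype(int)
```

Setting an item's discrimination on the second trait to zero makes that item depend on Trait 1 only, which is one way to mimic the unidimensional case in this sketch.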

Responses were generated for all simulees for both 
tests (Test 1 and Test 2). For item parameter estimation, 
only responses to Test 1 were used for the low ability 
group and only responses to Test 2 were used for the high 
ability group. For the medium ability group, only 
responses to the anchor test were used. Responses for all 
simulees to both tests were used to examine the factor 
analyses. 

Data Verification 

The means and standard deviations of the number-correct 
scores, item difficulties (p-values), and the KR-20 test 
reliability coefficients from the simulated tests were 
examined in order to verify that the simulations are 
realistic. 
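
For reference, the KR-20 coefficient for a matrix of 0/1 item scores can be computed as in the following sketch. The function name is hypothetical, and the use of the population (divide-by-n) variance for total scores is a choice made here so that it matches the p(1 - p) item variances; the report does not say which convention was used.

```python
import numpy as np

def kr20(u):
    """Kuder-Richardson formula 20 for a simulees-by-items 0/1 matrix."""
    u = np.asarray(u, dtype=float)
    k = u.shape[1]                    # number of items
    p = u.mean(axis=0)                # item difficulties (p-values)
    total_var = u.sum(axis=1).var()   # population variance of number-correct scores
    return (k / (k - 1)) * (1.0 - (p * (1.0 - p)).sum() / total_var)
```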

Verifying Multidimensionality 

In order to determine whether the generated data 
accurately simulate real data, two factor analyses were 
performed: a principal component analysis of tetrachoric 
correlations and a principal factor analysis of phi 
coefficients (McKinley & Reckase, 1982; Reckase, 1979). 

The factor analyses were examined in terms of the 
proportion of variance accounted for by the first factor. 
This is based on the assumption that a set of items is 
unidimensional if a large amount of the variance is 
accounted for by the principal factor or component. For 
this study, a procedure similar to that suggested by Lord 
and Novick (1968, pp. 381-382) for evaluating 
unidimensionality by performing a principal axis factor 
analysis was used. The first four factors were extracted 
using estimated communalities in the diagonal. (The 
diagonal values in the correlation matrix are replaced by 
the squared multiple correlation of each variable with all 
other variables.) The items may be considered as arising 
from a unidimensional latent space if the first common 
factor accounts for a "large" proportion of the common 
variance and if all factors after the first account for 
much smaller and approximately equal proportions of the 
common variance. 
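
A minimal Python sketch of this kind of analysis follows, assuming phi coefficients (Pearson correlations of the 0/1 item scores) and a single, non-iterated replacement of the diagonal by squared multiple correlations. The function name and the decision to express each eigenvalue as a percentage of the sum of the communality estimates are illustrative assumptions, not details confirmed by the report.

```python
import numpy as np

def smc_eigenvalues(u, n_factors=4):
    """Eigenvalues of the phi-coefficient matrix with SMCs in the diagonal.

    Returns the first n_factors eigenvalues and the percent of common
    variance (trace of the reduced matrix) that each accounts for.
    """
    r = np.corrcoef(np.asarray(u, dtype=float), rowvar=False)  # phi coefficients
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(r))  # squared multiple correlations
    reduced = r.copy()
    np.fill_diagonal(reduced, smc)               # communality estimates in diagonal
    eig = np.sort(np.linalg.eigvalsh(reduced))[::-1]
    pct = 100.0 * eig / smc.sum()
    return eig[:n_factors], pct[:n_factors]
```

Because the reduced matrix can have negative eigenvalues, the first eigenvalue can account for more than 100 percent of the common variance under this definition.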

Determination of whether the first two factors account 
for a "large" proportion of the common variance was done by 
comparing the data generated by the multidimensional model 
with that generated by the unidimensional model. The 
deviation of the multidimensional data from the 
unidimensional data was then determined by comparing the 
percent of variance accounted for by the first two factors 
in both sets of data. 

Parameter Estimation 

For each of the 33 data conditions, item parameters 
were estimated with LOGIST (Wingersky, Barton, & Lord, 
1982). Item parameters for Test 1 were estimated using 
responses from the low and medium ability groups. 
Similarly, item parameters for Test 2 were estimated using 
responses from the medium and high ability groups. This 
allows combined samples of 4,000 simulees, a sample size 
that has been found to be adequate for obtaining very 
stable item parameter estimates (Yen, 1983). Since separate 
pairs of LOGIST runs were made for each of the 33 data 
conditions, the result is two sets of estimated item 
parameters and estimated thetas per pair of tests. For each 
of these 33 conditions, the accuracy of the parameter 
estimates was examined. 

Accuracy of the Parameter Estimation 

A desirable characteristic of a parameter estimation 
procedure is the ability to obtain accurate item 
parameters. IRT estimation procedures for the 
one-dimensional case require the assumption of 
unidimensionality in order to estimate the parameters for 
a given set of items and the examinees' trait levels (Lord 
& Novick, 1968, Ch. 16). Violation of this assumption has 
been suggested as a problem in estimation of item 
parameters (Loyd & Hoover, 1980; Cook & Eignor, 1981). 

The data used here are known to exhibit 
multidimensionality. Therefore, it is informative to 
examine the effect of this multidimensionality on the 
estimation of the true parameters, a, b, c, and $\theta$. This 
effect can then be considered when the estimated item 
parameters, $\hat{a}$, $\hat{b}$, and $\hat{c}$, are used to 
perform an equating or for other purposes. 

Although in real life situations the real parameters 
are not known, one of the advantages of using simulated 
data is that the "real" item parameters are known; the 
"real" item parameters are those used to generate the data. 
Hence comparisons between real and estimated parameters can 
be made and such comparisons can be used to examine the 
accuracy of the estimation procedure. 

Within level parameter estimation. Examining the 
accuracy of the parameter estimations within levels 
involves comparing in some way the multidimensional true 
(generating) parameters and the unidimensional estimates. 
The approach used by Yen (1984b) is the method used here. 

$$\hat{P}_i(\hat{\theta}_k) \approx P_i(\theta_{k1}, \theta_{k2}) \quad \text{if} \quad \hat{c}_i = c_i \quad \text{and} \quad \hat{a}_i(\hat{\theta}_k - \hat{b}_i) = a_{i1}(\theta_{k1} - b_{i1}) + a_{i2}(\theta_{k2} - b_{i2}). \tag{5}$$

The closed form relationship between the unidimensional 
estimated parameters and the multidimensional true 
parameters is approximated by finding the unidimensional 
parameters that minimize the sum of the squared differences 
between the two sides of Equation 5. Then 

$$\hat{b}_i = \frac{a_{i1} b_{i1} + a_{i2} b_{i2}}{\hat{a}_i}, \tag{6}$$

[Equation 7, which involves $\hat{a}_i \hat{b}_i$, is not legible in the source.] 

$$\hat{a}_i = a_{i1}\, r(\theta_1, \hat{\theta}) + a_{i2}\, r(\theta_2, \hat{\theta}), \text{ and} \tag{8}$$

[Equation 9, an expression for $\hat{\theta}_k$ in terms of the true thetas and the discriminations, is not legible in the source.] 

Equations 4, 6, 8, and 9 give the approximation of the 
relationship between the unidimensional estimated 
parameters and the multidimensional generating parameters. 
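
As a rough numerical illustration of how least-squares composites of this kind can be computed, the sketch below uses two assumed forms: a composite discrimination $a_1 r(\theta_1, \hat{\theta}) + a_2 r(\theta_2, \hat{\theta})$ and a composite difficulty $(a_1 b_1 + a_2 b_2)$ divided by that discrimination. These forms are pieced together from the legible parts of the equations and from the composite $(a_1 b_1 + a_2 b_2)/\hat{a}$ reported in the results; they should be read as assumptions, and the function name is hypothetical.

```python
import numpy as np

def unidimensional_approximation(a1, a2, b1, b2, theta1, theta2, theta_hat):
    """Approximate unidimensional (a, b) from two-trait generating parameters.

    a1, a2, b1, b2: arrays of item parameters on Traits 1 and 2.
    theta1, theta2: true thetas; theta_hat: unidimensional theta estimates.
    Assumed forms (see lead-in): a* = a1*r1 + a2*r2, b* = (a1*b1 + a2*b2)/a*.
    """
    r1 = np.corrcoef(theta1, theta_hat)[0, 1]   # r(theta_1, theta-hat)
    r2 = np.corrcoef(theta2, theta_hat)[0, 1]   # r(theta_2, theta-hat)
    a_star = a1 * r1 + a2 * r2
    b_star = (a1 * b1 + a2 * b2) / a_star
    return a_star, b_star
```

When an item has zero discrimination on Trait 2 and the estimated thetas coincide with the Trait 1 thetas, the composites reduce to the Trait 1 parameters, which is the behavior expected of a unidimensional item.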

Within Level Comparisons 

The accuracy of the estimation procedure was examined 
by comparing both sets of true item parameters, from Test 1 
and from Test 2, to the estimated item parameters, and by 
comparing true thetas to the estimated thetas using 
correlations. Estimated $b_i$ values were compared with each 
of the two sets of true $b_i$ values ($b_{i1}$ and $b_{i2}$) for 
both traits, and similarly for the $a_i$ values. The estimated 
$c_i$ values were compared with the one set of true $c_i$ 
values. The estimated thetas ($\hat{\theta}$) were compared 
with the true thetas ($\theta_1$ and $\theta_2$) for the two 
traits. 

Cross Validation 

The Cross Validation data were generated in the same 
manner as were the Simulated Data (i.e. using the same data 
configurations and item parameters, but different seed 
numbers for the data generation). The Cross Validation 
data sets consist of a pooled group of three sets of 
observations: 2,000 of low ability, 2,000 of middle 
ability, and 2,000 of high ability. Response vectors and 
p-values were also generated as for the Simulated Data. 

Using the fixed item parameters that were estimated 
from the original data, thetas were estimated for the Cross 
Validation data. In the first run, thetas were estimated 
for Test 1 using item responses for all three ability 
levels on Test 1 and not reached (NR) for Test 2. Item 
parameters were fixed at the values estimated in the 
parameter estimation runs for the original data. 
Similarly, in the second run, thetas were estimated for 
Test 2, using item responses on Test 2, NR on Test 1, and 
fixed item parameters. The accuracy of the item and ability 
parameter estimates was then examined in the same manner as 
for the Simulated Data. 



RESULTS 

True Item Parameters 

All attained correlations among the item parameters 
used for generating the two-trait data are within ±.10 of 
the desired correlations; hence the attained correlations 
appear to be acceptable. 

Simulated Thetas 

Attained correlations between generated thetas on both 
traits are all within .04 of the desired correlations. 
Mean thetas are all within .08 of the desired mean ability 
for all three ability levels, and the standard deviations 
of the generated true thetas are all within .05 of the 
desired standard deviations. The simulated thetas appear to 
be acceptable for the Simulated and the Cross Validation 
data. 

Simulated Item Responses 

Table 4 contains means and standard deviations for 
number-correct scores for the pairs of simulated tests for 
Simulated and Cross Validation data. The simulated tests 
appear to be realistic, although for the $b_2 - b_1 = 2$ 
conditions, Tests 1 and 2 differ a great deal in 
difficulty. Recall that Test 1 and Test 2 are simulated to 
have equal difficulty when $b_2 - b_1 = 0$. When $b_2 - b_1 = 1$ 
or 2, Test 2 is the harder of the pair of tests. Tests 1 
and 2 have very similar difficulties for all configurations 
when $b_2 - b_1 = 0$. For all configurations, Test 1 appears 
easiest and Test 2 hardest when $b_2 - b_1 = 2$. The Cross 
Validation data follow the same pattern. 

Table 5 contains the KR-20 values for the pairs of 
tests for the Simulated and the Cross Validation data. The 
KR-20 values range from .74 to .94. With the exception of 
Configuration 3 and the medium+high ability group of 
Configuration 5, all KR-20 values decrease or else increase 
at most .01 as the tests go from easiest to hardest. 

For Test 1, Configuration 3 has only five items that 
measure Trait 2. These five items have high discrimination 
on both traits. Test 2 measures both traits with all 
items. When $b_2 - b_1 = 0$, the KR-20 for Test 1 is .05 to 
.06 less than the KR-20 for Test 2. This contrasts with 
the .00 to .02 differences for all other configurations. 
Also, for $b_2 - b_1 = 0$, the Test 2 KR-20 is larger than any 
Test 1 KR-20 in Configuration 3. 

Configuration 5 also breaks the trend of KR-20 
decreasing with increasing test difficulty by having its 
next to smallest KR-20 on the easiest test for the 
medium+high group of simulees. The overall trend appears 
to be that the hardest test for each configuration 
($b_2 - b_1 = 2$, Test 2) has the smallest KR-20. 

Another overall trend is that Configurations 1 through 
6 and the unidimensional configuration have KR-20 values 
ranging from the upper .70s to the upper .80s and lower 
.90s for the low+medium group, and from the .80s to the 
upper .80s and lower .90s for the medium+high group. 
However, Configurations 7 through 10 have overall higher 
KR-20 values. The low+medium group ranges from the lower 
.80s to the lower .90s, and the medium+high group values 
are all in the lower .90s. 

The Cross Validation KR-20s follow the same patterns as 
the Simulated Data. All Cross Validation KR-20s 
are within .02 of the original data KR-20s. 

Tables 6 through 8 contain the results of the factor 
analyses for the Simulation Data and for the Cross 
Validation. Table 6 contains the correlations between the 
first two factors for the oblique rotations. There appears 
to be no pattern in the correlations that discriminates 
between the unidimensional criterion and the 
multidimensional data configurations. All correlations 
range between .33 and .60 for the principal components 
analyses (PCA), and between .53 and .76 for the factor 
analyses using squared multiple correlations in the 
diagonals (SMC). 

A few patterns appear. For the PCA, all correlations 
for Test 2 decrease as the test gets harder ($b_2 - b_1$ 
increases from 0 to 1 to 2), and most correlations for Test 
1 decrease as the test gets easier ($b_2 - b_1$ increases from 
0 to 2). Also for PCA, the smallest correlations in each 
configuration are for the condition where $b_2 - b_1 = 2$. This 
is also true for most of the configurations for SMC. For 
both types of analyses, Configuration 4 has the highest 
correlations per condition and Configuration 5 the lowest. 

Configuration 2 has the greatest range of correlations, 
with a spread of .22 points (.33 to .55) on PCA compared to 
.05 to .11 for all other configurations, and a spread of 
.13 points (.58 to .71) on SMC compared to .02 to .06 for 
all other configurations. One other pattern that emerges 
is that Configurations 2 and 7 have smaller correlations 
for Test 2 than for Test 1 on both PCA and SMC. All other 
configurations have overlapping correlations for Test 1 and 
Test 2. 

Overall, the Cross Validation correlations follow the 
same general patterns as the Simulated Data. The Cross 
Validation correlations are at most ±.05 from the 
corresponding Simulated Data. 

Table 7 contains the first four eigenvalues from the 
principal components analyses. For all data sets, 
including the unidimensional, there appears to be a strong 
first factor and a much smaller second factor. In 
addition, Configurations 5, 6, and 10 appear to have a 
third small factor, which for Configuration 6, Test 2 is 
almost as large as the second factor. Configuration 9 
appears to have the largest first factor and Configuration 
5 appears to have the smallest. 

Recall that the configurations are arranged such that 
the correlation between traits increases from .00 for 
Configuration 1 to 1.00 for Configuration U (.0, .3, .4, 
.5, .5, .5, .6, .75, .9, .9, 1.00). With the exception of 
Configurations 1, 5, 10, and the unidimensional criterion, 
the size of the first eigenvalue increases as the 
correlation between traits increases. Also, as the test 
gets harder within a configuration (i.e., moving from the 
first entry for a given configuration through the last), 
the size of the first eigenvalue decreases. The only 
exception is that the condition $b_2 - b_1 = 0$, Test 2 has the 
largest eigenvalue for Configuration 3, and for all other 
configurations, the condition $b_2 - b_1 = 1$, Test 1 has the 
largest. 

The first eigenvalues of Configurations 1, 7, 8, 9, 10, 
and Test 2 of Configuration 3 are clearly greater than the 
first eigenvalues of the unidimensional criterion. The 
first eigenvalues of Configuration 5 are clearly smaller 
than the corresponding eigenvalues of the unidimensional 
criterion. The first eigenvalues of the remaining 
configurations (2, 4, 6, and Test 1, Configuration 3) are 
approximately equal to the first eigenvalues of the 
unidimensional criterion. 

In general, the Cross Validation data follow the same 
patterns as the Simulation Data. 

Table 8 contains the first four eigenvalues from the 
factor analyses using squared multiple correlations (SMC). 
SMC follows the same patterns as PCA. All data sets appear 
to have a strong first factor and a much smaller second 
factor. Configurations 5, 6, and 10 appear to have a third 
small factor, which for Configuration 6, Test 2 is almost 
as large as the second factor. Configuration 9 appears to 
have the largest first factor and Configuration 5 appears 
to have the smallest. 

The size of the first eigenvalue generally increases as 
the correlation between traits increases, with the 
exception of Configurations 1, 5, 10 and the unidimensional 
criterion. The size of the first eigenvalue decreases as 
the test gets harder within a configuration, except that 
condition $b_2 - b_1 = 1$, Test 1 has the largest for all 
configurations (except Configuration 3, where the condition 
$b_2 - b_1 = 1$, Test 2 is largest). 

The first eigenvalues of Configurations 1, 7, 8, 9, 10, 
and Test 2 of Configuration 3 are clearly greater than the 
first eigenvalues of the unidimensional criterion. The 
first eigenvalues of Configuration 5 are clearly smaller 
than the corresponding eigenvalues of the unidimensional 
criterion. The first eigenvalues of the remaining 
configurations (2, 4, 6, and Test 1, Configuration 3) are 
approximately equal to the first eigenvalues of the 
unidimensional criterion. 

As with the principal components analyses, the Cross 
Validation data generally follow the same patterns as the 
Simulation Data. 



Table 8. The First Four Eigenvalues from Factor Analyses 
Using SMC, for the Simulated Data and the Cross Validation 
data, by configuration, $b_2 - b_1$ condition, and test. 
[Table values are not legible in the source.] 

Table 9 contains the percent of variance accounted for 
and the cumulative percent of variance accounted for by the 
first four eigenvalues. The percent of variance accounted 
for by the first eigenvalue ranges from 16 to 40 percent. 
The percentages for Configurations 1, 7, 8, 9, 10, and Test 
2 of Configuration 3 are mostly in the thirties and upper 
twenties, while for Configurations 2, 3 (Test 1 only), 4, 
5, 6, and U, the percentages are all in the upper teens and 
lower twenties. The Cross Validation values are all within 
±.01 of the Simulated Data values and are not reported 
here. 

Table 10 contains the percent of variance accounted for 
and the cumulative percent of variance accounted for by the 
first four eigenvalues for the factor analyses using SMC. 
For Configurations 2, 3 (Test 1 only), 4, 6, and U, the 
first eigenvalues account for at least 101% of the variance 
in most conditions. For Configurations 1, 3 (Test 2), 5, 
7, 8, 9, and 10, the percent of variance accounted for is 
mostly between 90 and 100, with the exception of conditions 
where $b_2 - b_1 = 2$, Test 2 (the hardest test in each 
configuration). The second eigenvalue accounts for 6 to 10 
percent of the variance in Configurations 2, 3 (Test 1), 4, 
6, and U, and accounts for 9 to 14 percent of the variance 
for Configurations 1, 3 (Test 2), 7, 8, 9, and 10. Most 
notable is Configuration 5, for which the second eigenvalue 
accounts for 19-22 percent of the variance. 

The percent of variance accounted for by the third and 
fourth eigenvalues is near zero for all configurations and 
all conditions. The Cross Validation data values are all 
within ±.02 of the Simulated Data values and are not 
reported here. 

Parameter Estimation. Tables 11 and 12 contain 
comparisons of true versus estimated parameters. Table 11 
contains the correlations of the true and estimated item 
parameters for the Simulated Data. For Configurations 2, 3 
(Test 1), 4, 6, 10, and U, the correlations between 
difficulty on Trait 1 ($b_1$) and estimated difficulty 
($\hat{b}$) are all .90 and above. In particular, the 
correlations for $\hat{b}$ and $b_1$ are all .98 and .99 for 
Configuration U. For Configurations 1, 3 (Test 2), 5, 7, 8, 
and 9, the correlations are all under .80 with most between 
.60 and .78. The most notable exception is that the 
correlations between $\hat{b}$ and $b_1$ for Configuration 5, 
Test 2 are .26, .36, and .34 in order as the test gets 
harder. These are the only correlations between estimated 
difficulty and difficulty on Trait 1 that are under .59. 

For the correlations between estimated difficulty 
($\hat{b}$) and difficulty on Trait 2 ($b_2$), the best 
correlations (.87 to .95) are for Configuration 10, the 
multidimensional configuration with the highest correlation 
(.90) between traits. All other correlations between 
$\hat{b}$ and $b_2$ are .78 or less. For Configurations 2, 3 
(Test 1), 4 (Test 1), and 6, the correlations are all below 
.20. The correlations for Configuration 5, Test 1 are .24 
to .30. For Configurations 1, 3 (Test 2), 4 (Test 2), 5 
(Test 2), 7, 8, and 9, the correlations are all between .50 
and .77. 

The correlations between $\hat{b}$ and $b_2$ are all within 
±.15 of the correlations between $\hat{b}$ and $b_1$ for 
Configurations 1, 3 (Test 2), 7, 8, 9, and 10. For 
Configuration 4 (Test 2) these correlations are within .21 
to .30 of each other and for Configuration 5 they are 
within .33 to .53. However, the largest differences for the 
correlations of estimated difficulty and difficulty on 
Trait 2 versus estimated difficulty and difficulty on Trait 
1 occur for Configurations 2, 3 (Test 1), 4 (Test 1), and 
6. These correlations range from .78 to .96 less for 
$r(\hat{b}, b_2)$ than for $r(\hat{b}, b_1)$. 

Note the large differences and the increase in size of 
the correlations $r(\hat{b}, b_1)$ and $r(\hat{b}, b_2)$ for 
Configuration 5. For Test 1, the correlations of estimated 
difficulty with difficulty on Trait 1 are .77 or .78 and 
for Test 2 the correlations are .26 to .36. The reverse is 
true for Trait 2. Test 2 has the higher correlations (.67 
to .77) and Test 1 has the lower (.24 to .30). 

The correlations of $\hat{b}$ with $(a_1 b_1 + a_2 b_2)/\hat{a}$ 
are mostly .98 and higher. A few are in the lower .90s with 
one .89. 

The correlations of estimated discrimination ($\hat{a}$) 
with true discrimination on Trait 1 ($a_1$) follow some of 
the same patterns as the correlations between estimated and 
true difficulty. The highest correlations between $\hat{a}$ 
and $a_1$ are for Configurations 2, 3 (Test 1), 4, 6, and U, 
as was true for the correlations between $\hat{b}$ and $b_1$. 
However, the only configurations with correlations in the 
.90s are Configurations 3 (Test 1), 4, and U. 
Configurations 2 and 6 are in the .70s and .80s. 
Configurations 1, 3 (Test 2), 7, 8, 9, and 10 are all 
between .40 and .72, with one exception. Condition 
$b_2 - b_1 = 2$, Test 2, Configuration 7 is a very low .31. 
Configuration 5 has the lowest overall correlations on 
Trait 1 (.11 to .41), and with the exception of 
Configuration 4, Configuration U has the largest 
correlations. 

For the correlations of estimated discrimination 
($\hat{a}$) with discrimination on Trait 2 ($a_2$), the first 
2 conditions of Test 1, Configuration 3 have the largest 
correlations (.80 and .83). All other correlations are .73 
or less. Configurations 1, 8 (Test 2), 9, and 10 are mostly 
between .50 and .73. Correlations for Configuration 7 range 
from .44 to .55, and Configuration 3 (Test 2) correlations 
range from .25 to .55. Configuration 5 correlations are 
mostly in the .30s. All correlations for Configurations 2, 
4, 6, and 8 (Test 1) are less than .31. In particular, for 
Configuration 6, all correlations of estimated 
discrimination with discrimination on Trait 2 are negative. 

Note that, except for Configuration 10, the same 
configurations (2, 3 Test 1, 4, 6, and U) have the best 
correlations for Trait 1 discrimination as have the best 
correlations for Trait 1 difficulty. 

Configuration U has the best overall correlations (.90 
to .95) between $\hat{a}$ and $a_1 r_1 + a_2 r_2$. Correlations 
for Configurations 2 (Test 1), 3 (Test 1), 4, 5, and 6 
(Test 1) are mostly in the upper .80s. Correlations for 
Configurations 1, 3 (Test 2), 7, and 9 are mostly in the 
.70s and lower .80s. For Configurations 8 and 10, the 
correlations are mostly in the .60s and .70s. Test 2 for 
Configurations 2 and 6 has the lowest correlations: .53 to 
.54 for Configuration 2, Test 2, and .46 to .55 for 
Configuration 6, Test 2. Note the difference in size of 
correlations between Test 1 and Test 2 for Configurations 
2, 3, and 6. 

There appears to be no pattern for the correlations 
between $\hat{c}$ and c. These correlations range from -.05 
to .79, with some of the poorest correlations occurring for 
the unidimensional criterion. 

With very few exceptions, all Cross Validation 
correlations are equal to the corresponding Simulated Data 
correlations. The exceptions are all within ±.02 of the 
Simulated Data correlations. 

Table 12 contains the correlations between the true and 
estimated trait values for both Simulated Data and Cross 
Validation. The correlations between estimated ability 
($\hat{\theta}$) and ability on Trait 1 range from .44 to .92. 
Configurations 2, 3 (Test 1), 4, 6, and U have correlations 
ranging from .79 to .92 with most correlations in the .90s. 
For each of these configurations, the smallest correlation 
is for condition $b_2 - b_1 = 2$, the hardest test. The 
correlations for Configurations 3 (Test 2), 5, 7, 8, 9, and 
10 are mostly in the .70s and .80s, except the hardest test 
of Configuration 5 with .67, and the easiest test of each 
of Configurations 7, 8, 9, and 10 which are .52, .55, .56, 
and .59, respectively. Overall, Configuration 1 has the 
smallest correlations between $\hat{\theta}$ and $\theta_1$ of 
all the configurations, ranging from a low of .44 for the 
easiest test to between .62 and .68 for the other 
conditions. 

In general, the correlations for Configurations 1, 3 
(Test 2) , 5, 7, 8, 9, and 10 increase as the correlation 
between traits increases. 

The correlations of estimated ability with ability on 
Trait 2 are all approximately equal to the correlations of 
estimated ability with ability on Trait 1 for corresponding 
conditions of Configurations 1, 3 (Test 2), 5, 7, 8, 9, and 
10. For Configurations 2, 3 (Test 1), 4, and 6, the Trait 
2 correlations are all .28 to .56 less than the 
corresponding Trait 1 correlations. In general, these 
differences decrease as the correlation between traits 
increases. 

The correlations between $\hat{\theta}$ and the composite 
$\tilde{\theta}$ are mostly in the .80s and .90s. The smallest 
correlations are .58 for the easiest test of Configurations 
7, 8, and 9, and .60 for the easiest test of Configurations 
1 and 10. All other correlations are .76 and above. The 
smallest correlations for each of Configurations 2, 3, 4, 
5, 6, and U are all on the hardest test and range from .77 
to .83. 

For Configurations 1, 3 (Test 2), 5, 7, 8, 9, and 10, 
the correlations between $\hat{\theta}$ and $\tilde{\theta}$ are 
larger than the corresponding correlations of $\hat{\theta}$ 
with both $\theta_1$ and $\theta_2$. This difference decreases as 
the correlation between $\theta_1$ and $\theta_2$ increases. For 
Configurations 2, 3 (Test 1), 4, and 6, the correlations 
between $\hat{\theta}$ and $\tilde{\theta}$ are approximately 
equal to those between $\hat{\theta}$ and $\theta_1$ and hence 
the correlations between $\hat{\theta}$ and $\tilde{\theta}$ are 
much larger than the corresponding correlations between 
$\hat{\theta}$ and $\theta_2$ (since the correlations between 
$\hat{\theta}$ and $\theta_1$ are much larger than the 
corresponding correlations between $\hat{\theta}$ and 
$\theta_2$ for these configurations). This difference 
decreases as the correlation between $\theta_1$ and $\theta_2$ 
increases. 

The Cross Validation correlations are all within .03 of the 
corresponding Simulated Data correlations, with one 
exception. The correlations for Configuration 2, Test 1, 
b2 - b1 = 0 are .17 points larger for the Simulated Data 
than for the Cross Validation data for both r(9, 9 a ) and 
r(3,y a ). 

DISCUSSION 

The purpose of this research is to examine the effects 
of various multidimensional data configurations on 
parameter estimation with the three-parameter logistic 
model. Ten two-trait data configurations and one 
unidimensional criterion were chosen. For each of these 
eleven configurations, three difficulty conditions were 
simulated. Data were generated using a multidimensional 
model for degrees of correlation between traits of .00 to 
.90 and for one unidimensional criterion. 

Simulations. The simulated item parameters and thetas 
were well within acceptable limits of the desired values. 
Means, standard deviations, and KR-20 values of 
number-correct scores indicated that all the conditions 
simulated realistic test configurations. 
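
The KR-20 coefficient mentioned above can be illustrated with a 
minimal sketch. The function name kr20 and the five-examinee, 
four-item response matrix are hypothetical and are not taken from 
the study; the population variance of the number-correct scores is 
assumed in the denominator. 

```python
def kr20(responses):
    """KR-20 reliability for rows of 0/1 item scores (one row per examinee)."""
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of the number-correct scores (an assumption).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum over items of p * q, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - pq_sum / var_total)


# Hypothetical data: five examinees, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(responses)
```

For a 30-item test the same function applies unchanged; only the 
width of each response row differs. 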

Multidimensionality. Two factor analyses were examined 
in order to verify the multidimensionality of the generated 
data. The factor analyses do not seem to discriminate 
consistently, since for all data sets, including the 
unidimensional criterion, there appears to be a strong 
first factor and a much smaller second factor. Hence, all 
the data sets appear to be two-dimensional. The 
correlations between the first two factors do not appear to 
have any pattern that discriminates between the 
unidimensional criterion and the multidimensional data 
configurations. These correlations certainly fail to 
follow the pattern of correlations between traits ranging 
from .00 to .90 for the multidimensional configurations. 

This is similar to the McKinley and Reckase (1984) 
findings that correlations between factors did not follow 
the pattern of correlations between traits for 
two-dimensional simulated data. However, for the McKinley 
and Reckase data, the size of the first eigenvalue 
decreased and the size of the second eigenvalue increased 
as the correlation between the two traits decreased. This 
is in direct contrast to the general trend that can be seen 
in the data presented here. In general, as the 
correlation between traits decreases, the size of the 
first eigenvalue increases. The multidimensional model 
used by McKinley and Reckase to generate their data is an 
extension of the Birnbaum (1968) two-parameter model that 
uses two discrimination parameters and one item parameter 
related to difficulty. Clearly, the multidimensional model 
used here and the multidimensional model used by McKinley 
and Reckase are generating different data configurations. 

Upon examining the first four eigenvalues of the 
principal components analyses (PCA) and the factor analyses 
using squared multiple correlations (SMC), a few patterns 
emerged that caused a rethinking/restructuring of the 
multidimensionality (or lack of it) for each of the chosen 
configurations. The 'true' item discriminations for each 
test were chosen to represent real data in that if an item 
does not measure a trait on a test, then the discrimination 
for that item on that trait is zero. If an item does 
measure a trait on a test, then the discrimination of that 
item on that trait is non-zero. 

The item discriminations for each test for each 
configuration were examined from the point of view that if 
most or all of the items "load" (discriminate) on one 
dimension only, then the test is probably unidimensional or 
near enough so to be called unidimensional. If most or all 
of the items "load" on both dimensions, then the test is 
probably multidimensional. 

This causes a grouping of the multidimensional 
configurations into three groups (actually four), based on 
the dimensionality of the tests. Clearly, both tests of 
Configurations 1, 5, 7-10, and Test 2 of Configuration 3, are 
multidimensional since for Configurations 1 and 7-10, 
every item measures both dimensions, and for Configuration 
5, 17 items (over half) measure Trait 1 and 17 items 
measure Trait 2. Hence, Group M1, a multidimensional 
group, consists of Configurations 1 and 7-10. Group M2, 
also a multidimensional group, consists of Configuration 5 
alone since it is the only configuration with about half of 
the items measuring each dimension. 

Another group (Group U) can be considered as a group 
with unidimensional tests. Group U consists of 
Configurations 2, 4, 6, and U. Both tests on these 
configurations and Test 1 of Configuration 3 can be expected 
to be unidimensional because Trait 2 is either not measured 
at all or is measured by only five of the 30 items (i.e., 
discriminations on Trait 2 are all zero or only five items 
"load" on Trait 2). For Test 1 of Configurations 2 and 4, 
Trait 2 is not measured at all and hence has zero 
discriminations for all items. Therefore, Test 1 for 
Configurations 2 and 4 is expected to be unidimensional. 
Similarly, Test 1 of Configurations 3 and 6, and Test 2 of 
Configurations 2, 4, and 6 are all probably unidimensional 
since 25 of 30 items do not load on the second trait. 

The two tests of Configuration 3 fit in different 
groups (Test 1 is unidimensional and Test 2 is 
multidimensional). For the sake of brevity of discussion, 
Test 1 will be considered as part of Group U and Test 2 as 
part of Group M1, although, strictly speaking, the groups 
consist of two-dimensional configurations, not individual 
tests. 

The factor analyses were then reexamined taking these 
groupings into account. The first eigenvalues of Group M1 
are all greater than the first eigenvalues of the 
unidimensional criterion. For Group M2, the first 
eigenvalue is clearly less than the first eigenvalue of the 
unidimensional criterion. For Group U, the first 
eigenvalue is approximately equal to the first eigenvalue 
of the unidimensional criterion. Recall that the tests of 
Group U are considered unidimensional by the discrimination 
(loading) criterion. Note, then, that these factor 
analysis results do support the grouping of the 
configurations into those with multidimensional tests 
versus those with unidimensional tests. When the 
configurations consist of unidimensional tests, the first 
eigenvalues are approximately equal to the first eigenvalue 
of the unidimensional criterion. When the configurations 
consist of multidimensional tests, the first eigenvalues 
are either larger than or smaller than the first 
eigenvalues of the unidimensional criterion. 

Configuration 5 is similar to the McKinley and Reckase 
Test 1 data and the Group M1 configurations are similar to 
the McKinley and Reckase Test 2 data. In particular, 
keeping in mind that different generating models are 
involved, the McKinley and Reckase Dataset 2 with a 
correlation of .5 between traits is similar to 
Configuration 5, which also has a correlation of .5, and 
McKinley and Reckase Dataset 8 has the same correlation (0) 
as Configuration 1. For Dataset 2, the correlation between 
factors for the PCA was -.59, compared to correlations of 
.45 to .50 for Configuration 5. The correlations for 
Configurations 8 and 1 versus Datasets 5 and 8 are .55 to 
.60 and .46 to .57 versus .62 and -.57. These latter 
correspond more closely than the Configuration 5 versus 
Dataset 2 correlations. The McKinley and Reckase Test 2 
correlations between factors varied as correlations between 
traits decreased, as is also true for the corresponding 
data sets here (Group M1 configurations). 

The first four eigenvalues for Dataset 2 were 9.09, 
1.79, 1.30, and 1.28, indicating a strong first factor and 
a smaller second one. In contrast, the Configuration 5 
eigenvalues indicate three factors, a strong first factor 
and weaker second and third factors. Again, the generating 
models seem to be simulating different things. For all 
McKinley and Reckase Test 2 datasets, the first four 
eigenvalues indicated a large first factor and an extremely 
small or nonexistent second factor. The size of the first 
eigenvalue decreases and the size of the second increases 
as the correlation between traits decreases. Although not 
as clear, the same general trend of a decrease in the first 
eigenvalue as the correlation between abilities decreases 
appears in the data reported here for Group M1. However, 
Group M1 appears to have a small second factor, while the 
McKinley and Reckase Test 2 data do not. 

The percent of variance accounted for by the first 
eigenvalue for the PCA also supports the pattern of 
groupings. Group M1 percentages are mostly in the 30s and 
upper 20s. Group U and the unidimensional criterion 
percentages are all in the tens and lower 20s. Group M2 
also has percentages in the tens and lower 20s. 

For the SMC, the percent of variance accounted for 
discriminates between these groups even better than the PCA 
does. For Group U and the unidimensional criterion, the 
first eigenvalue accounts for at least 101% of the 
variance in most conditions, and the second eigenvalue 
accounts for 6-10% of the variance. For Group M1, the 
percent of variance accounted for by the first eigenvalue 
is between 90 and 100, except for the hardest test in each 
configuration, and the second eigenvalue accounts for 9-14% 
of the variance. The first eigenvalue for Group M2 
accounts for 92-102% of the variance. The second eigenvalue 
for Group M2 accounts for 19-22% of the variance. 

Therefore, clearly, Configuration 5 (Group M2) is 
multidimensional, with a dominant first factor. The 
multidimensionality of Group M1 is also verified by the 
factor analyses using SMC. The SMC analyses support 
unidimensionality for Group U and the unidimensional 
criterion (i.e., those multidimensional configurations that 
appear to have unidimensional tests appear to be as 
unidimensional as the unidimensional criterion. Those 
multidimensional configurations with multidimensional tests 
are supported by the SMC factor analyses as being composed 
of two-dimensional tests, as they were constructed to be). 
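
Percentages above 100 for a first eigenvalue in the SMC analyses 
can look odd at first reading. The sketch below, which is only an 
illustration and not the analysis program used in the study, shows 
one way such values can arise: when squared multiple correlations 
replace the unit diagonal, the reduced matrix can have negative 
eigenvalues, so the largest eigenvalue can exceed the trace. The 
function name, the six-item matrix, and the convention of dividing 
by the sum of all eigenvalues are assumptions. 

```python
import numpy as np


def first_eigen_percentages(R):
    """Return (PCA %, SMC %) of variance for the largest eigenvalue of R."""
    R = np.asarray(R, dtype=float)
    # PCA: eigenvalues of the full correlation matrix sum to the item count.
    pca_vals = np.linalg.eigvalsh(R)[::-1]
    pca_pct = 100.0 * pca_vals[0] / pca_vals.sum()
    # SMC: squared multiple correlation of each item with the others,
    # 1 - 1 / (diagonal element of the inverse correlation matrix).
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    reduced = R.copy()
    np.fill_diagonal(reduced, smc)
    smc_vals = np.linalg.eigvalsh(reduced)[::-1]
    # Negative trailing eigenvalues make the total smaller than the first one.
    smc_pct = 100.0 * smc_vals[0] / smc_vals.sum()
    return pca_pct, smc_pct


# Hypothetical six-item matrix: every inter-item correlation equals .3.
R = np.full((6, 6), 0.3)
np.fill_diagonal(R, 1.0)
pca_pct, smc_pct = first_eigen_percentages(R)
print(f"PCA first eigenvalue: {pca_pct:.1f}% of variance")
print(f"SMC first eigenvalue: {smc_pct:.1f}% of common variance")
```

In this toy matrix the PCA percentage stays well below 100 while 
the SMC percentage exceeds it, which is the same qualitative 
pattern as the SMC percentages reported above. 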

The multidimensionality of Configuration 5 supports 
the McKinley and Reckase (1984) conclusions that when the 
two dimensions underlying the tests are independent of each 
other (i.e., each item discriminates on only one of the 
dimensions), then correlated abilities tend to yield 
response data with a dominant component. 

McKinley and Reckase also found that when the two 
dimensions underlying the test do not operate independently 
of each other (each item discriminates on both dimensions), 
then the effect of the correlations between abilities is 
the same, but less extreme (i.e., correlated abilities tend 
to yield response data with a dominant component). The 
Group M1 data also appear to yield a dominant component. 
However, in contrast to the McKinley and Reckase Test 2 
data having extremely small or no second factors, the Group 
M1 data appear to have a small second factor and in some 
cases a third factor. 

It appears that the size of the correlation between 
traits used in generating the data with the 
multidimensional model used here was not as important in 
causing the data to be multidimensional as was the pattern 
of the loadings of the discriminations on the traits for 
the two tests. However, with the McKinley and Reckase 
model, the dimensionality of the response data appears to 
depend on the ability correlations. Correlated abilities 
tended to yield response data with correlated dimensions 
(tended to be unidimensional) and uncorrelated abilities 
tended to yield response data with relatively uncorrelated 
dimensions (tended to be multidimensional). 

Parameter estimation. How well the item parameters 
were estimated appeared to depend to some extent on whether 
the tests were unidimensional or multidimensional. 
For Group U and the unidimensional criterion, b1 was 
estimated well. Configuration 10 also has very good 
estimates for b1. For Group M1, b1 was not estimated as 
well. Especially notable is the extremely poor estimation 
of b1 for Configuration 5, Test 2 (Group M2). 

Configuration 10 had the best estimation of b2. This 
would be expected since the correlation between b2 and b1 
is .80. The poorest estimations of b2 occurred for Group U 
(except Configuration 4 Test 2), and for Configuration 5, 
Test 1. Group M1 (except Configuration 10), Configuration 
4 Test 2, and Configuration 5 Test 2 estimates, while also 
poor, were a little better than Group U. 

Since LOGIST produces only one b, the estimated b has to 
estimate both b1 and b2. If the correlation between b1 and 
b2 is low, either the estimated b can estimate b1 well, or 
it can estimate b2 well, or it can estimate both poorly, but 
it cannot estimate both well. If the correlation between b1 
and b2 is medium, then the estimation of both b1 and b2 by 
the estimated b can be medium to poor, or possibly one can 
be estimated well and the other medium to poor. If the 
correlation between b1 and b2 is high, then the estimation 
of both b1 and b2 must be about the same, ranging from poor 
to good. 
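
The argument above can be made concrete with a small sketch. It 
relies on a general property rather than on anything specific to 
LOGIST: the three correlations among the estimated b, b1, and b2 
must form a valid (positive semidefinite) correlation matrix, which 
bounds r(b,b2) once r(b,b1) and the correlation between b1 and b2 
are fixed. The function name and the value .95 for r(b,b1) are 
hypothetical. 

```python
import math


def feasible_range(r_b_b1, rho):
    """Bounds on r(b, b2) given r(b, b1) and rho = r(b1, b2).

    Follows from requiring the 3 x 3 correlation matrix of
    (estimated b, b1, b2) to have a non-negative determinant.
    """
    half_width = math.sqrt((1.0 - r_b_b1 ** 2) * (1.0 - rho ** 2))
    centre = r_b_b1 * rho
    return centre - half_width, centre + half_width


# Hypothetical case: b1 is estimated well, r(b, b1) = .95.
for rho in (0.0, 0.5, 0.8):
    low, high = feasible_range(0.95, rho)
    print(f"r(b1,b2)={rho:.1f}: r(b,b2) must lie in [{low:.2f}, {high:.2f}]")
```

With a zero correlation between b1 and b2, a well-estimated b1 
leaves room only for a low r(b,b2); as the correlation between b1 
and b2 rises, the upper bound on r(b,b2) rises with it. 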

If the correlation between b1 and b2 is high, then 
LOGIST is trying to estimate practically the same 
difficulty values. If LOGIST did not estimate both b1 and 
b2 well, then perhaps LOGIST would be suspected of doing a 
poor job of estimating the difficulty parameter. 

Note in Table 1 that the correlation between b1 and b2 
is zero for all configurations except Configuration 10, 
where the correlation is .8, and Configuration 4 Test 2, 
where the correlation is .5. For Configuration 10, since 
the correlation between b1 and b2 is so high, the excellent 
estimation of both would be expected. Similarly, the 
medium correlation between b2 and the estimated b for 
Configuration 4 Test 2 could be expected since the 
correlation between b1 and b2 is .5. However, since there 
is zero correlation between b1 and b2 for all other 
conditions, and b1 was well estimated for Group U, b2 
could not be estimated very well. 

Both b1 and b2 were estimated very poorly for 
Configuration 5. Note the large differences and the 
increase in size of the correlations r(b,b1) and r(b,b2). 
For Test 1, the correlations of estimated difficulty with 
difficulty on Trait 1 are in the upper .70s and for Test 2 
the correlations are .26 to .36. The reverse is true for 
Trait 2: Test 2 has the higher correlations (.67 to .77) and 
Test 1 has the lower (.24 to .30). These two sets of 
correlations are the lowest of all configurations for the 
corresponding traits. Configuration 5 Test 2 has the 
lowest correlations for Trait 1 and Configuration 5 Test 1 
has the lowest for Trait 2 for the multidimensional tests. 
For Group M2, the estimation of b2 was mediocre, not as 
poor as the Group U estimates and nearly as good as the 
Group M1 estimates. 

In spite of the fact that LOGIST is intended for only 
unidimensional tests, it has done an excellent job of 
estimating difficulty for these various multidimensional 
data sets. When the correlation between b1 and b2 was 
high, the estimation of both b1 and b2 was good. When the 
correlation between b1 and b2 was medium, b1 was well 
estimated and b2 was estimated neither poorly nor well. 
Similarly, when the correlation between b1 and b2 was zero, 
and Trait 2 was measured by few or no items, then the b 
parameter was estimated well for the trait that was 
measured and very poorly for the trait that was not 
measured. This would clearly indicate support for the 
belief that LOGIST is doing the task for which it is 
intended, at least as far as estimating difficulty is 
concerned. 

The differences between the correlations r(b,b1) and 
r(b,b2) also tend to follow the grouping pattern. The 
largest differences are for Group U, where b1 is estimated 
very well and b2 is estimated extremely poorly. For Group 
M2, Test 1 of Configuration 5 has fairly good estimates of 
b1 and poor estimates of b2, while Test 2 has fairly good 
estimates of b2 and poor estimates of b1. This follows 
logically from the fact that nearly half the items (13 of 
30) do not measure Trait 1, 13 others do not measure Trait 
2, and only 4 items measure both traits. Therefore, 
approximately half the items "load" (discriminate) on Trait 
1 and half on Trait 2. This allows one trait to be 
estimated fairly well while the other is estimated poorly. 

It appears that how well the difficulty parameter is 
estimated depends on an interaction between whether or not 
the test is unidimensional according to the "loading" 
criterion, and how closely correlated the difficulties 
on the two traits are. If the item clearly measures 
one trait and not the other, the difficulty parameter on 
the trait measured is estimated well. However, if the test 
measures both traits, then mediocre estimation of the 
difficulty parameter for both traits can be expected unless 
the correlation between difficulty on both traits is high. 
In all cases, r(b,b*) is very high. This supports the 
accuracy of Yen's (1984b) equation for predicting how the 
unidimensional difficulty parameter is related to the 
multidimensional difficulty parameters. 

The estimation of the discrimination values follows some 
of the grouping patterns established. Discrimination for 
the unidimensional criterion was estimated better than for 
any other configuration. For Group U, a1 is estimated 
well. For Group M1, the estimation of a1 is mostly 
mediocre. Group M2 has the poorest overall correlations 
for discrimination on Trait 1. 

The estimation of discrimination on Trait 2 (a2) does 
not seem to follow a useful pattern. Most correlations are 
mediocre to poor. In general, the estimation of a2 for 
Group M1 is better than that for Group U. The size of the 
correlation between a1 and a2 appears to have no effect on 
the estimation of either a1 or a2. However, the estimation 
of a1 and a2 does appear to be dependent on whether or not 
the tests are considered to be unidimensional according to 
the discrimination "loading" criterion. If the test is 
considered to be unidimensional according to this criterion, 
then a1 is well estimated and a2 is poorly estimated. If 
the test is considered to be multidimensional, then the 
estimation of both a1 and a2 is mediocre to poor. 

The instability of the discrimination parameter shown 
here is comparable to Yen's (1980) findings of unstable 
item discrimination estimates for items from an 
achievement test. Yen hypothesizes that a possible cause 
for the instability in the estimations for the real data 
could be a carefulness dimension. However, Yen used very 
small sample sizes (183-66S), which have been shown to 
yield unstable parameter estimates. 

The results found here for the discrimination estimates 
for Group U configurations support the Reckase (1977, 1979) 
conclusion that the three-parameter model computes item 
discrimination parameter estimates related to one factor. 
This is the result that would be theoretically predicted 
(Christoffersson, 1975). Results for the other 
multidimensional configurations are not so clear cut. If 
LOGIST were drawn to one dominant factor, one would expect 
the discriminations for one of the two traits to be 
estimated better than the other. However, in the M1 and M2 
groups, this is not the case. In all conditions, the 
discrimination for both traits showed, at most, mediocre 
estimation. This supports the Drasgow and Parsons (1983) 
conclusion that for some multidimensional data 
configurations LOGIST is not drawn to a general factor. 

Equation 15 predicts that if both traits are equally 
influential, then the discrimination of both traits will be 
given equal weight in obtaining the estimated a. The low 
correlations between the estimated a and both a1 and a2 for 
Group M2 support the accuracy of this equation. However, 
r(a,a*) is very high for all conditions of Group M2. As 
with the difficulty parameters, the accuracy of Yen's 
(1984b) equations for predicting the relationship between 
the unidimensional and the multidimensional item parameters 
is upheld. 

Note also that it is assumed here that a high 
correlation implies that the parameters are well estimated. 
This might not be true. If the two sets of discriminations 
had equal standard deviations but different means, the 
correlation could be 1.00, even though none of the 
corresponding values are equal. However, this would not 
change the conclusions drawn from the low correlations 
found here. Obviously, if LOGIST were drawn to one group 
factor (i.e., computes item discrimination estimates 
related to one factor), then the item discrimination 
parameters for one of the two traits should have been 
better estimated. 
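
The caution about reading a high correlation as good estimation 
can be checked with a short sketch. The discrimination values and 
the constant shift of .5 below are hypothetical and are not taken 
from the simulations. 

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical true discriminations and estimates shifted by a constant:
# the standard deviations are equal but the means differ by .5.
true_a = [0.6, 0.9, 1.2, 1.5, 1.8]
est_a = [a + 0.5 for a in true_a]

r = pearson_r(true_a, est_a)
smallest_error = min(abs(e - t) for e, t in zip(est_a, true_a))
print(f"r = {r:.2f}, smallest absolute error = {smallest_error:.2f}")
```

Every estimate is off by the same amount, yet the correlation is 
essentially perfect, so a correlation alone cannot confirm that 
individual parameter values were recovered. 
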

The estimation of the c parameter is poor to mediocre 
for all conditions of all configurations. This is 
consistent with the Ree and Jensen (1983) results of low 
correlations between c and estimated c (.031 to .315) for 
sample sizes from 250 to 2000. Their data were generated 
using the one-dimensional three-parameter model. No 
comparisons could be made for c estimated from data 
generated with a three-parameter multidimensional model, 
since none of the multidimensional research reported data 
for c. 

In order to adequately estimate the guessing parameter, 
a substantial number of low ability examinees is required 
(Lord, 1975; Hambleton & Martois, 1983; Ree & Jensen, 1983; 
Wingersky, 1983). For very easy items or items that do not 
discriminate well, the item response function will not 
become asymptotic at the lower end of the range of 
abilities in the sample. If there are no or few examinees 
at the lower end of the range of abilities, there is no 
information with which to estimate c. Hence, LOGIST 
estimates a (the same) fixed c for all such items. There 
may have been too few examinees with low test scores in 
these data. However, this appears implausible, as explained 
below. 
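
The point about the lower asymptote can be sketched with the 
unidimensional three-parameter logistic item response function. 
The scaling constant D = 1.7 and the item parameter values below 
are assumptions chosen for illustration only. 

```python
import math


def p_correct(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    D = 1.7 is the usual scaling constant, assumed here.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical items with the same a and c but different difficulty,
# evaluated for a low-ability examinee at theta = -2.
c = 0.2
easy_item = p_correct(-2.0, a=1.0, b=-2.0, c=c)
hard_item = p_correct(-2.0, a=1.0, b=1.0, c=c)
print(f"easy item: P = {easy_item:.3f} (far above c = {c})")
print(f"hard item: P = {hard_item:.3f} (close to c = {c})")
```

For the easy item the curve is still well above c at the low end 
of the ability range, so responses from such examinees carry 
little information about c; for the hard item the curve has nearly 
reached its asymptote. 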

It would be expected that as the difficulty of the 
items increases there would be more examinees with low test 
scores. Hence, c should be better estimated for the harder 
tests. This certainly does not appear to be the case here. 
Recall that the difference in average difficulty of the 
easiest to the hardest tests in each configuration is 2.00. 
The difficulty parameter for each item had 1.0 added or 
subtracted to obtain Test 1 and Test 2 with a 2.0 
difference between average difficulty. Therefore, every 
item for the harder test of each configuration has a 
difficulty value that is 2.0 larger than the difficulty 
value for some item on the easier test. Hence, c should be 
better estimated for the harder tests. The correlations 
between c and estimated c do not increase as the test gets 
harder. In fact, for some configurations, the opposite 
occurs. In general, no consistent pattern occurs at all. 

Lord (1973) and others have shown that LOGIST parameter 
estimates for the three-parameter model are adequate if N > 
1000 and n > 50. The item parameters used here were 
estimated using 4000 simulees (2000 low + 2000 medium). 
(All 6000 simulees were not used since the program used to 
get the response vector data from tape to LOGIST could not 
handle 6000 simulees and 60 items.) Possibly the 30-item 
tests were too small for good parameter estimation. 

The correlations between true and estimated trait 
values also follow the multidimensional/unidimensional 
grouping. Group U and the unidimensional criterion have 
the highest correlations between estimated ability and 
ability on Trait 1. Groups M1 and M2 have correlations 
about .1 to .2 lower. Configuration 1 has the lowest. 
Notice for Groups M1 and M2 that, in general, the 
correlation between estimated θ and θ1 increases as the 
correlation between θ1 and θ2 increases. 

The difference between Groups M1 and M2 versus Group U 
is clear cut when the correlations between estimated θ and 
θ2 are compared to the correlations between estimated θ and 
θ1. For Groups M1 and M2, these correlations are nearly 
equal; for Group U, the correlations of estimated ability 
with ability on Trait 2 are all .23 to .56 less than the 
corresponding correlations of estimated ability with 
ability on Trait 1. These differences mostly decrease as 
the correlation between traits increases. 

Apparently, when both tests are unidimensional 
according to the "loading" criterion, ability on Trait 1 is 
well estimated (as well as the unidimensional criterion is 
estimated), but Trait 2 is very poorly estimated. These 
differences decrease (i.e., Trait 2 is estimated better) as 
the correlation between traits increases. When one or both 
tests are considered to be multidimensional, then the 
estimation of both traits is approximately the same. Also, 
when one or both tests are considered to be 
multidimensional, as the correlation between traits 
increases, the correlations between estimated θ and true θ 
increase until they are as good as or better than the 
correlations for the corresponding configurations of the 
unidimensional criterion (except for the easiest test, where 
the correlation remains less than .60). 

For most tests, the correlation decreases as the test 
gets harder. The noticeably smaller correlation on the 
easiest test in each of the Group M1 configurations appears 
to be due to excessive loss of simulees due to zero or 
perfect scores. Examination of Table 12 shows that five 
configurations (1, 7, 8, 9, 10) have a correlation under .60 
for the easiest test for both traits. A gap of .19 to .34 
exists between the easiest test and the next test in all of 
these configurations, compared to virtually equal 
correlations for these two tests for all other 
configurations. For these five configurations, the loss of 
simulees (1150, 1412, 1336, 1442, 1079, respectively) is 
noticeably larger than nearly all other conditions for all 
other configurations. It appears that a loss of over 1000 
examinees will noticeably lower the correlations between 
estimated thetas and true thetas on both dimensions. 

Examination of the situations in which examinees are 
lost due to zero or perfect scores reveals other 
interesting results. From Table 12, the Group M1 
configurations show a loss of 17 to 1442 simulees 
(.3-24.0%) with an average loss of 7.1%, 9.9%, 9.6%, 11.1%, 
and 7.4% for Configurations 1 and 7-10, respectively. Group 
M2 (Configuration 5) shows a loss of 7 to 694 simulees 
(.1-11.6%) with an average loss of 3.4% for the six 
conditions of Configuration 5. Group U shows a loss of up 
to 687 (.1-11.4%) with an average loss of 2.5%, ^.6%, and 
3.3% for Configurations 2, 4, and 6, respectively. This is 
comparable to the unidimensional criterion, which shows a 
loss of 8 to 569 (.1-9.5%) with an average loss of 3.0%. 
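
The percentages quoted above are consistent with the 6000 
simulees generated for each condition. A minimal check, using 
only counts stated in the text (the function and dictionary names 
are hypothetical): 

```python
N_SIMULEES = 6000  # simulees generated per condition, as stated in the study


def percent_lost(n_lost, n_total=N_SIMULEES):
    """Percent of simulees dropped for zero or perfect scores."""
    return 100.0 * n_lost / n_total


# Largest losses reported for each group in the paragraph above.
largest_losses = {
    "Group M1": 1442,
    "Group M2": 694,
    "unidimensional criterion": 569,
}
for label, n_lost in largest_losses.items():
    print(f"{label}: {n_lost} lost = {percent_lost(n_lost):.1f}%")
```

The same function reproduces the lower ends of the ranges, for 
example 17 lost simulees is about .3% of 6000. 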

Configuration 3 has an average loss of 4.2% over the 
six conditions, but breaking it up into Test 1 (4.9%) versus 
Test 2 (3.4%) again allows a comparison of Test 1 with the 
other unidimensional tests and Test 2 with the other 
multidimensional tests. The Group U Test 1 average losses 
per test are all under 6.1% while the Group M1 are all over 
11.9%. Configuration 3 Test 1 clearly falls in with the 
other unidimensional configurations, as would be expected. 
Group U Test 2 average losses are all under 1.1%, while the 
Group M1 are all over 2.2%. Configuration 3 Test 2 falls 
in with the multidimensional configurations, as would be 
expected. Note that Configuration 3 is the only 
configuration with a large gap between the losses for Test 1 
b2 - b1 = 0 and Test 2 b2 - b1 = 0. The multidimensional test 
lost 35% more simulees than the unidimensional test. 

Configuration 5 is again in a group by itself, with the 
Test 1 average loss (between 6% and 7%) being greater than 
all Group U configurations and less than all Group M1 
configurations, and Test 2 falling within the Group U 
percentages. The average loss over all six conditions of 
Configuration 5 is 3.4%, which again falls within the Group 
U percentages. Although not as dramatic, Yen's (1984b) 
results using N = 1000 also show the drop in correlations 
between estimated theta and both true thetas, and the 
increase in examinees lost due to zero or perfect scores in 
the easiest test. 




Clearly, in all cases except Configuration 5, the 
multidimensional tests lost more simulees to zero or 
perfect scores than the unidimensional tests. This could 
have serious implications for item and ability parameter 
estimation. 

How well both traits are estimated, then, appears to 
depend on how strong the correlation between true ability 
on the two traits is, on whether the two tests are 
unidimensional or multidimensional according to the 
"loading" criterion, and on how many examinees are lost due 
to zero or perfect scores. The higher the correlation 
between true ability on the two traits, the better the 
estimation of both traits. (Of course, LOGIST is meant to 
estimate only one trait.) If one or both tests are 
considered to be multidimensional according to the 
"loading" criterion, then both traits are estimated fairly 
well; however, if both tests are unidimensional, then one 
trait is well estimated and one is estimated poorly. If 
over 1000 examinees are lost, the traits are poorly 
estimated, despite N (approximately 3000) being larger than 
the criterion set by previous researchers. These results 
support several studies where the same conclusions were 
reached (Christoffersson, 1975; Drasgow & Parsons, 1983; 
McKinley, 1983; Reckase, 1977, 1979; Yen, 1984b). 

Reckase (1977, 1979) found that when there is a 
dominant first factor present in multidimensional data, 
then the three-parameter model estimates that single 
factor. The Group U data sets here clearly have a dominant 
first factor. Trait 1 is well estimated and Trait 2 is not 
well estimated. This is also true for Configuration 3 Test 
1. These data set results support the Reckase findings. 
Just as clearly, the other configurations have nearly equal 
estimation of both traits and these estimates get better 
for both traits as the correlation between θ1 and θ2 
increases. This does not support the Reckase findings. 
However, Reckase's (1977, 1979, 1981b) conclusion was that 
although unstable item parameter estimates may result, good 
ability estimates can be obtained despite the data being 
multidimensional. This conclusion is supported here. The 
discrepancies found here between the Reckase findings and 
those indicated by the data presented in this paper may 
very well be due to the different sample sizes (Reckase N = 
1000), the different generating models (Reckase used a 
linear factor analysis model), and sampling error. 

Drasgow and Parsons (1983) found that when one trait is 
sufficiently prepotent (dominant), then a unidimensional 
model provides a good description of multidimensional data 
sets. The results shown here for the Group U 
configurations support this conclusion. However, the 
conclusions from these data go beyond that, indicating that 
even with two-dimensional data, the trait estimates are 
good enough to conclude that a unidimensional model can 
describe multidimensional (two-dimensional) data well, at 
least when the correlation between θ₁ and θ₂ is above .5. 

Yen's (1984b) mathematical predictions support the 
hypothesis that multidimensional data analyzed by the 
unidimensional three-parameter model result in a unidimensional 
trait that is a combination of the underlying traits. If 
the test involves traits that influence all or most of the 
items, the prediction is that the underlying, true traits 
have approximately equal influence in determining estimated 
θ. Her simulated results confirm that prediction, as do 
the correlations of true and estimated thetas for Group M1 
and M2 configurations here. If the test involves 
independent traits, one of which influences only a few items, that 
trait is ignored in the definition of the unidimensional 
three-parameter trait. Group U correlations support this 
second prediction, as do the Reckase (1979) results. 
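
The first prediction can be made concrete with a small worked example. It is only a sketch under simplifying assumptions that are not taken from the study: the estimated trait is treated as the equally weighted composite (θ₁ + θ₂)/2, both true traits have unit variance, and they correlate ρ. Under those assumptions the composite correlates with each true trait at sqrt((1 + ρ)/2), so both traits are recovered better as ρ rises. All function names are hypothetical.

```python
import math
import random


def composite_correlation(rho):
    """Correlation of theta1 (or theta2) with the composite (theta1 + theta2) / 2.

    Assumes both traits have unit variance and correlate rho with each other.
    """
    return math.sqrt((1.0 + rho) / 2.0)


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulated_composite_correlation(rho, n=20000, seed=0):
    """Monte Carlo check of composite_correlation using correlated normal traits."""
    rng = random.Random(seed)
    theta1, composite = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        t1 = z1
        t2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2  # corr(t1, t2) = rho
        theta1.append(t1)
        composite.append((t1 + t2) / 2.0)
    return pearson(theta1, composite)


for rho in (0.0, 0.5, 0.9):
    print(rho, round(composite_correlation(rho), 3))
```

For ρ = .5 the formula gives about .87, and for uncorrelated traits about .71. This is an idealized upper bound for a noise-free composite, not a reproduction of the correlations reported here.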

Summary and conclusions. It is accepted knowledge 
that many existing standardized tests, such as most 
achievement tests and many aptitude tests, do not satisfy 
the unidimensionality assumption of the three-parameter 
logistic model (Bejar, 1983; Bock, 1979; Hambleton & Cook, 
1977; Hutten, 1980; Kingston & Dorans, 1982; Reckase, 1977, 
1979). Therefore, the question to be answered is not 
whether the assumption is satisfied but whether a specific 
use of the model is robust to violations of the assumption 
(Hambleton & Cook, 1977; Hambleton, Swaminathan, Cook, 
Eignor, & Gifford, 1978; Reckase, 1981). Hambleton et al. 
(1978) and Yen (1984a, 1984b) presented evidence that the 
models are robust to some departures. The results of this 
research present more. 

Factor analyses were used in order to verify whether 
the data were truly multidimensional or not. The factor 
analyses supported a division of the simulated 
multidimensional data sets into groups according to how the 
tests "load" (discriminate) on the two dimensions. The 
tests either both "load" heavily on both dimensions (both 
tests are multidimensional), one test "loads" heavily on 
one dimension and the other test "loads" heavily on the 
same dimension (both tests are unidimensional), one test is 
unidimensional and one multidimensional, or one test 
"loads" heavily on one dimension and the other test "loads" 
heavily on the other dimension. 

Although the strength of the correlation between the 
two generating traits seemed to have little effect on the 
quality of the parameter estimation, there is evidence that 
the unidimensionality or multidimensionality of the tests, 
as determined by both factor analyses and the 
discrimination loadings on the two dimensions, does have an 
effect on item parameter estimation. 

When both tests are unidimensional, both loading most 
heavily on the same dimension, then a₁ and b₁ are well 
estimated, and a₂ and b₂ are poorly estimated. If both 
tests to be equated are multidimensional, then b₁ is 
estimated fairly well, b₂ is poorly estimated, and a₁ and 
a₂ are mostly poorly estimated. If both tests are 
multidimensional with Test 1 loading heavily on one 
dimension and Test 2 loading heavily on the other 
dimension, then b₁ and b₂, a₁ and a₂ are all poorly 
estimated. If Test 1 is unidimensional and Test 2 is 
multidimensional, then for Test 1 a₁ and b₁ are well 
estimated and a₂ and b₂ are poorly estimated, while for 
Test 2 b₁ is fairly well estimated, and b₂, a₁, and a₂ are 
poorly estimated. The estimation of the c parameter was 
mediocre to poor for all conditions of all configurations. 
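
To make the roles of the item parameters concrete, the sketch below gives the standard unidimensional three-parameter logistic response function and one possible two-trait extension with a discrimination and a difficulty on each dimension plus a common c. The compensatory two-trait form, the function names, and the scaling constant d = 1.7 are assumptions made for illustration; this section does not state the generating model, so the sketch should not be read as the model actually used in the study.

```python
import math


def p_3pl(theta, a, b, c, d=1.7):
    """Unidimensional three-parameter logistic item response function."""
    return c + (1.0 - c) / (1.0 + math.exp(-d * a * (theta - b)))


def p_3pl_two_traits(theta1, theta2, a1, b1, a2, b2, c, d=1.7):
    """Hypothetical compensatory two-trait extension.

    This form is an assumption for illustration, not necessarily the
    generating model of the study.
    """
    z = d * (a1 * (theta1 - b1) + a2 * (theta2 - b2))
    return c + (1.0 - c) / (1.0 + math.exp(-z))


# An item that "loads" only on the first dimension (a2 = 0) behaves like a
# unidimensional item: the second trait has no effect on the probability.
print(p_3pl_two_traits(0.5, -2.0, a1=1.0, b1=0.0, a2=0.0, b2=0.0, c=0.2))
print(p_3pl(0.5, a=1.0, b=0.0, c=0.2))
```

In this form a test is "unidimensional" in the loading sense when the second discrimination is near zero for most items, and "multidimensional" when both discriminations are substantial.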

The results of this research indicate that the poorest 
item parameter estimates occur for the situation in which 
one test is unidimensional and one is multidimensional, 
such as a situation in which Trait 2 is measured by only a 
few items on one test and by most or all of the items on 
the other test. This situation appears to be worse than if 
both tests are unidimensional or both are multidimensional. 

In conclusion, these results indicate that use of the 
three-parameter logistic model is as good, in most 
instances, for parameter estimation of multidimensional 
data as it is for unidimensional data for the types of 
conditions studied in this research. Caution should be 
exercised, however, when one test is unidimensional and one 
is multidimensional, such as occurs when a higher level has 
a few items measuring a trait that a lower level does not 
measure. 